A Study of Touching Characters in Degraded Gurmukhi Text

نویسندگان

  • Manish Kumar Jindal
  • Gurpreet Singh Lehal
  • Rajendra Kumar Sharma
چکیده

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis. Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text. Keywords—Character Segmentation, Middle Zone, Touching Characters.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper we have proposed a complete solution for segmenting touching characters in all the three zones of printed Gurmukhi script. A study of touching Gurmukhi cha...

متن کامل

On Segmentation of Touching Characters and Overlapping Lines in Degraded Printed Gurmukhi Script

Character segmentation plays a very important role in a text recognition system. The simple technique of using inter-character gap for segmentation is useful for fine printed documents, but this technique fails to give satisfactory results if the input text contains touching characters. In this paper, we have proposed two algorithms to segment touching characters, and one algorithm to segment o...

متن کامل

A Technique for Segmentation of Gurmukhi Text

This paper describes a technique for text segmentation of machine printed Gurmukhi script documents. Research in the field of segmentation of Gurmukhi script faces major problems mainly related to the unique characteristics of the script like connectivity of characters on the headline, two or more characters in a word having intersecting minimum bounding rectangles, multicomponent characters, t...

متن کامل

A Complete Machine printed Gurmukhi OCR System

Recognition of Indian language scripts is a challenging problem. Work for the development of complete OCR systems for Indian language scripts is still in infancy. Complete OCR systems have recently been developed for Devanagri and Bangla scripts. Research in the field of recognition of Gurmukhi script faces major problems mainly related to the unique characteristics of the script like connectiv...

متن کامل

Segmentation of Isolated and Touching Characters in Offline Handwritten Gurmukhi Script Recognition

Segmentation of a word into characters is one of the important challenges in optical character recognition. This is even more challenging when we segment characters in an offline handwritten document. Touching characters make this problem more complex. In this paper, we have applied water reservoir based technique for identification and segmentation of touching characters in handwritten Gurmukh...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005